Optimizing Aggregate Query Processing in Cloud Data Warehouses

Authors

  • Swathi Kurunji
  • Tingjian Ge
  • Xinwen Fu
  • Benyuan Liu
  • Amrith Kumar
  • Cindy X. Chen
Abstract

In this paper, we study and optimize aggregate query processing in a highly distributed Cloud Data Warehouse, where each database stores a subset of relational data in a star schema. Existing aggregate query processing algorithms, such as the Two-phase algorithm, focus on optimizing various query operations but give less importance to communication cost overhead. However, in cloud architectures, communication cost overhead is an important factor in query processing. Thus, we take communication overhead into account to improve distributed query processing in such cloud data warehouses. We then design query-processing algorithms by analyzing the aggregate operation and eliminating most of the sort and group-by operations with the help of integrity constraints and our proposed storage structures, PK-map and Tuple-index-map. Extensive experiments on PlanetLab cloud machines validate the effectiveness of our proposed framework in improving response time, reducing node-to-node interdependency, minimizing communication overhead, and reducing the database table accesses required for aggregate queries.

Similar resources

Optimizing Communication for Multi-Join Query Processing in Cloud Data Warehouses

In this paper, we present storage structures, PK-map and Tuple-index-map, to improve the performance of query execution and inter-node communication in Cloud Data Warehouses. Cloud Data Warehouses require Read-Optimized databases because large amounts of historical data are integrated on a regular basis to facilitate analytical applications for report generation, future analysis, and decision-ma...

Rewriting OLAP Queries Using Materialized Views and Dimension Hierarchies in Data Warehouses

OLAP queries involve a lot of aggregations on a large amount of data in data warehouses. To process expensive OLAP queries efficiently, we propose a new method for rewriting a given OLAP query using various kinds of materialized aggregate views which already exist in data warehouses. We first define the normal forms of OLAP queries and materialized views based on the lattice of dimension hierar...

S4: A New Secure Scheme for Enforcing Privacy in Cloud Data Warehouses

Outsourcing data to the cloud has become popular thanks to the pay-as-you-go paradigm. However, this practice raises privacy concerns. The conventional way to achieve data privacy is to encrypt sensitive data before outsourcing. When data are encrypted, a tradeoff must be achieved between security and efficient query processing. Existing solutions that adopt multiple encryption schemes induce a ...

Scalable real-time OLAP on cloud architectures

In contrast to queries for on-line transaction processing (OLTP) systems that typically access only a small portion of a database, OLAP queries may need to aggregate large portions of a database which often leads to performance issues. In this paper we introduce CR-OLAP, a scalable Cloud based Real-time OLAP system based on a new distributed index structure for OLAP, the distributed PDCR tree. ...

Posse: A Framework for Optimizing Incremental View Maintenance at Data Warehouses

We propose the Posse framework for optimizing incremental view maintenance at data warehouses. To this end, we show how for a particular method of consistent probing it is possible to have the power of SQL view queries with multiset semantics, and at the same time have available a spectrum of concurrency from none at all as in previously proposed solutions to the maximum concurrency obtained...

Publication date: 2014